Algorithm and Theoretical Analysis for Domain Adaptation Feature Learning with Linear Classifiers

Authors

  • Wenhao Jiang
  • Feiping Nie
  • Korris Fu-Lai Chung
  • Heng Huang
Abstract

The domain adaptation problem arises in a variety of applications where the training set (source domain) and the testing set (target domain) follow different distributions. The difficulty of this learning problem lies in bridging the gap between the source and target distributions. In this paper, we give a formal analysis of feature learning algorithms for domain adaptation with linear classifiers. Our analysis shows that, to achieve good adaptation performance, the second moments of the source and target domain distributions should be similar. Based on this result, we propose a new linear feature learning algorithm for domain adaptation. Furthermore, the new algorithm is extended to multiple layers, yielding another linear feature learning algorithm. The proposed method is effective for domain adaptation tasks on the Amazon review dataset and the spam dataset from the ECML/PKDD 2006 discovery challenge.
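
The second-moment condition in the abstract can be illustrated with a small diagnostic. The sketch below is not the paper's algorithm; it only computes the empirical second-moment matrix of each domain and reports how far apart the two are. The function names, the toy Gaussian data, and the choice of the Frobenius norm as the distance are all assumptions made for illustration.

```python
import math
import random


def second_moment(samples):
    """Return the empirical second-moment matrix (1/n) * sum_i x_i x_i^T."""
    n = len(samples)
    d = len(samples[0])
    moment = [[0.0] * d for _ in range(d)]
    for x in samples:
        for i in range(d):
            for j in range(d):
                moment[i][j] += x[i] * x[j] / n
    return moment


def moment_gap(source, target):
    """Frobenius norm of the difference between the two second-moment matrices.

    The Frobenius norm is an assumed choice of distance, not one taken from the paper.
    """
    ms, mt = second_moment(source), second_moment(target)
    d = len(ms)
    return math.sqrt(
        sum((ms[i][j] - mt[i][j]) ** 2 for i in range(d) for j in range(d))
    )


def draw(rng, n, scales):
    """Hypothetical toy data: independent zero-mean Gaussian features."""
    return [[rng.gauss(0.0, s) for s in scales] for _ in range(n)]


rng = random.Random(0)
source = draw(rng, 2000, [1.0, 1.0])
similar_target = draw(rng, 2000, [1.0, 1.0])   # same feature scales as the source
shifted_target = draw(rng, 2000, [3.0, 0.5])   # different feature scales

gap_similar = moment_gap(source, similar_target)
gap_shifted = moment_gap(source, shifted_target)
print(f"similar: {gap_similar:.3f}, shifted: {gap_shifted:.3f}")
```

On this toy data the gap for the target with matching feature scales should come out much smaller than the gap for the rescaled target, which is the kind of mismatch the analysis suggests a learned feature map should reduce.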

Similar resources

Theoretic Analysis and Extremely Easy Algorithms for Domain Adaptive Feature Learning

Domain adaptation problems arise in a variety of applications, where a training dataset from the source domain and a test dataset from the target domain typically follow different distributions. The primary difficulty in designing effective learning models to solve such problems lies in how to bridge the gap between the source and target distributions. In this paper, we provide comprehensive an...

Sample-oriented Domain Adaptation for Image Classification

Image processing is a method for performing operations on an image in order to obtain an enhanced image or to extract useful information from it. Conventional image processing algorithms do not perform well in scenarios where the training images (source domain) used to learn the model have a different distribution from the test images (target domain). Also, many real world applicat...

Image alignment via kernelized feature learning

Machine learning is an application of artificial intelligence that is able to automatically learn and improve from experience without being explicitly programmed. The primary assumption for most of the machine learning algorithms is that the training set (source domain) and the test set (target domain) follow from the same probability distribution. However, in most of the real-world application...

Application of ensemble learning techniques to model the atmospheric concentration of SO2

In view of pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...

Efficient Learning of Domain-invariant Image Representations

We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers. Specifically, we form a linear transformation that maps features from the target (test) domain to the source (training) domain as part of training the classifier. We optimize both the transformation and classifier parameters jointly, an...

Journal:
  • CoRR

Volume abs/1509.01710

Publication year 2015